A Novel Generalized Value Iteration Scheme For Partially-Unknown Continuous-Time Linear Systems

Authors

  • Jae Young Lee
  • Jin Bae Park
  • Yoon Ho Choi
Abstract

This paper presents a novel generalized value iteration technique for solving, online, discounted linear quadratic (LQ) optimal control problems for continuous-time (CT) linear systems with an unknown system matrix A. The method works with the discounted value function, a standard setting in reinforcement learning (RL) frameworks that has not been fully considered in RL for CT dynamical systems. In addition, a stepwise-varying learning rate η_i is introduced for fast and safe convergence. Regarding stability and monotone convergence to the true optimal solution, it is proven mathematically that if the stepwise-varying learning rate η_i lies in certain specified ranges, the proposed algorithm guarantees the so-called Hurwitz property, which concerns the stability of the closed-loop system, and converges to the discounted LQ optimal solution. Because the proposed method generalizes the existing value iteration method, these proofs also yield stability and monotone convergence conditions for that method as a special case.
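
As a rough illustration of the kind of iteration the abstract describes, the following minimal Python sketch runs a learning-rate-scaled value iteration on a scalar discounted LQ problem. It is not the authors' algorithm. It assumes the system parameter a is known, whereas the paper treats A as unknown and learns online. It also assumes the cost is the integral of e^(-γt)(q x² + r u²), and the update p ← p + η_i · (Riccati residual) is an assumed, model-based stand-in. All names (value_iteration_scalar, discounted_are_scalar, eta) are hypothetical.

```python
import math


def discounted_are_scalar(a, b, q, r, gamma):
    """Positive root of 2*(a - gamma/2)*p + q - (b*b/r)*p*p = 0.

    Assumption: discounting by e^(-gamma*t) shifts a to a - gamma/2.
    """
    a_g = a - gamma / 2.0
    return (r / (b * b)) * (a_g + math.sqrt(a_g * a_g + b * b * q / r))


def value_iteration_scalar(a, b, q, r, gamma, eta, steps, p0=0.0):
    """Iterate p <- p + eta(i) * residual(p) and return every iterate.

    eta is a callable giving the stepwise-varying learning rate eta_i.
    This model-based update is an illustrative assumption only.
    """
    a_g = a - gamma / 2.0
    p = p0
    history = [p]
    for i in range(steps):
        # Residual of the (assumed) discounted scalar Riccati equation.
        residual = 2.0 * a_g * p + q - (b * b / r) * p * p
        p = p + eta(i) * residual
        history.append(p)
    return history


# Example: open-loop unstable plant (a = 1) with a constant rate eta_i = 0.1.
a, b, q, r, gamma = 1.0, 1.0, 1.0, 1.0, 0.2
history = value_iteration_scalar(a, b, q, r, gamma, lambda i: 0.1, 200)
p_star = discounted_are_scalar(a, b, q, r, gamma)
gain = b * history[-1] / r  # feedback u = -gain * x
print(history[-1], p_star, (a - gamma / 2.0) - b * gain)
```

In this example the iterates increase monotonically from p0 = 0 toward the closed-form root, and the shifted closed-loop coefficient (a - γ/2) - b·gain is negative, which is the scalar analogue of the Hurwitz property mentioned in the abstract. Choosing eta too large can break both properties, which is why the admissible range of η_i matters.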

Similar resources

Analytical and Verified Numerical Results Concerning Interval Continuous-time Algebraic Riccati Equations

This paper studies the interval continuous-time algebraic Riccati equation A∗X + XA + Q − XGX = 0 from both theoretical and computational perspectives. On the theoretical side, we show that Shary’s results for interval linear systems can only be partially generalized to this interval Riccati matrix equation. We then derive an efficient technique for enclosing the united stable...

Optimal adaptive leader-follower consensus of linear multi-agent systems: Known and unknown dynamics

In this paper, the optimal adaptive leader-follower consensus of linear continuous-time multi-agent systems is considered. The error dynamics of each player depend on its neighbors’ information. A detailed analysis of online optimal leader-follower consensus under known and unknown dynamics is presented. The introduced reinforcement learning-based algorithms learn online the approximate solution...

Eigenvalue Assignment Of Discrete-Time Linear Systems With State And Input Time-Delays

Time delays are important components of many dynamical systems that describe coupling or interconnection between dynamics, propagation or transport phenomena, and heredity and competition in population dynamics. Stabilization with time delay in observation or control poses difficult mathematical challenges in the control of distributed parameter systems. It is well-known that the stabi...

Newton iterations in implicit time-stepping scheme for differential linear complementarity systems

We propose a generalized Newton method for solving the system of nonlinear equations with linear complementarity constraints in the implicit or semi-implicit time-stepping scheme for differential linear complementarity systems (DLCS). We choose a specific solution from the solution set of the linear complementarity constraints to define a locally Lipschitz continuous right-hand-side function in...

A New Inexact Inverse Subspace Iteration for Generalized Eigenvalue Problems

In this paper, we present an inexact inverse subspace iteration method for computing a few eigenpairs of the generalized eigenvalue problem Ax = λBx [Q. Ye and P. Zhang, Inexact inverse subspace iteration for generalized eigenvalue problems, Linear Algebra and its Applications, 434 (2011) 1697-1715]. In particular, the linear convergence property of the inverse subspace iteration is preserved.

Publication date: 2011